Networks on Silicon: Blessing or Nightmare?

Authors

  • Paul Wielage
  • Kees G. W. Goossens
Abstract

Continuing VLSI technology scaling raises several deep submicron (DSM) problems, such as relatively slow interconnect, power dissipation and distribution, and signal integrity. These problems are encountered particularly on long wires for global interconnect. As clock frequencies increase, scaled wires become relatively slower, and on-chip communication will be the limiting performance factor of future chips. We explain why efficient sharing of the wires for long-distance communication is the solution to this problem. We introduce networks on silicon (NoS), which route packets over shared (semi)global wires. NoS performance is expected to be high, but comes at a cost. Balancing the performance and cost of a NoS is a major challenge, and we believe busses still have a role to play.

Proceedings of the Euromicro Symposium on Digital System Design (DSD’02) 0-7695-1790-0/02 $17.00 © 2002 IEEE

1 Technology trend

VLSI technology scaling has long followed Moore's law. No fundamental barriers have been identified that invalidate this law for at least another decade [12]. Moore's law predicts that chips in 2010 will count over 4 billion transistors, operating in the multi-GHz range. This abundance of transistors will make very complex systems on silicon (SoS) possible. However, challenges at all abstraction levels of design will have to be addressed before such SoSs become a reality. The three most important deep submicron (DSM) challenges, related to all abstraction levels, are: substantial wire delay, controlling power delivery and dissipation, and assuring signal integrity.

Until recently, on-chip wiring was cheap. Consequently, architectural models have been employed that relied on low-latency communication to globally share expensive computational resources. Global wire delay stays at best constant under technology scaling, and hence these wires become effectively slower compared to a gate delay. For example, for 130 nm technology the reachable distance of a repeated global signal in a clock cycle is no more than the length of a chip [4]. For 50 nm technology, crossing a chip with highly optimized interconnect takes between six and ten clock cycles, clearly invalidating the low-latency assumption of today. Hence we must move to system-level architectures that scale with technology.

A feasible template for a future-proof architecture is constructed from processing nodes that do not grow in complexity with technology. Instead, as technology scales, the number of these processing nodes on the chip grows. An on-chip communication network then combines these nodes into a SoS [4]. Various publications show that the spanning wires in blocks of 50k gates scale with technology [4, 13]. This means that the aforementioned DSM issues can be handled by CAD tools, assuming their evolutionary improvement. Figure 1 shows the exponentially increasing number of such 50k blocks for a large die in subsequent technologies; in 35 nm this number is approximately ten thousand (adapted from [13] and [4]). It remains to find a communication architecture that allows a SoS composed of these blocks to cooperate efficiently.

Figure 1. The number of 50k blocks for future process technologies.

2 Networks on silicon are inevitable

Given the growing demand for and impact of interconnect on system cost and performance, it is worthwhile to optimize the utilization of wires. Ad-hoc global wiring structures often lead to a huge number of wires with an average usage as low as 10% in time [2]. To control cost in this scenario, the wire packing density must be very high, which is not beneficial for the power and delay characteristics. Efficient mechanisms for sharing (semi)global wires must solve this cost-performance dilemma.

In deep submicron technologies, (semi)global wires need special attention for power, signal-integrity, and performance reasons. In the discussion below we show how special circuit techniques can handle these issues. Such techniques only work, however, when embedded in dedicated communication IP, which provides a more abstract interface.

Power is an issue for global interconnect because it costs more energy to send a bit of information over longer wires. Reducing the communication delay requires bigger drivers, which increases the energy consumption. Employing low-swing signaling for the global wires saves up to a factor of four in power for these wires [15]. Implementing low-swing signaling requires special circuit techniques.

Signal integrity is hampered increasingly by growing capacitive and inductive coupling between wires. Capacitive noise coupling is the result of the large aspect ratio of wires in DSM technologies. Inductive noise coupling becomes more of a problem due to the decreasing transition times. IR drop (see footnote 1) in the supply distribution increasingly contributes to the noise. The most effective way to make a connection robust against noise is the application of differential signaling [7]. Differential signaling reduces both the generation of noise and the sensitivity to it.

The signal propagation delay of an uninterrupted wire grows quadratically with its length; hence, from a certain length onwards it is advantageous to partition the wire into segments with repeaters in between. The repeater insertion technique improves bandwidth and latency, but at the cost of higher power consumption. Wire delay can be reduced by fat wires with a lower resistance per unit length, at the cost of lower wire density. Such wires behave like lossy transmission lines and require drivers with a resistance matched to the transmission line.

As a result, we believe that all inter-block communication will be implemented by hard-macro transmitters and receivers, employing low-swing differential signaling, with well-controlled interconnect instead of ad-hoc drivers handled by standard place-and-route tools. In this way, communication links can be realized with predictable performance and DSM robustness.

Currently, the prevalent on-chip interconnects are busses [1]. In a bus architecture, devices share a single transmission medium to communicate. At a given time, only one device has access to the shared medium. An arbitration mechanism is required to order simultaneous accesses. Such functionality is typically performed by a centralized bus arbiter. The performance of a shared-medium bus scales badly. For an increasing number of bus clients, (i) individual clients get less bandwidth on average, and (ii) increased capacitive loads and wire length decrease the total bandwidth.

A solution that pairs scalable communication performance and minimal interconnect cost is expected from networks on silicon (NoS), where the SoS is considered as a network of components [2, 3, 1]. Figure 2 illustrates the hardware architecture of this concept. The outer components (marked P) exclusively perform processing and storage functions, whereas the inner components (marked B and R) form the NoS and cater to the communication needs of the outer components.

Figure 2. Structural view of a network on silicon consisting of processing nodes (P) and nodes supporting communication (R, B).

The basic building blocks of a NoS are routers (R). A router forwards data from its input ports to its output ports in a concurrent fashion. To that end, a router of arity N contains an N x N switch matrix. Data packets make their way through the network based on the routing information in their headers.

A link between two routers is implemented by a point-to-point connection. The links typically span medium to long distances, ranging from several to more than twenty millimeters. The actual length depends on the chosen topology of the network. For a mesh topology the links are relatively short; for a torus, which is a mesh with wrap-around connections, some links have a length of half the edge of the chip. Links can be optimized for bandwidth, latency, power, or a combination of these, depending on performance requirements.

3 NoS requirements

An important characteristic of a future system-level architecture is the separation between computation and com-

Footnote 1. Supply voltage drops are caused by high currents (I) flowing through the resistance (R) of the supply network. Since the supply voltage reduces under scaling, IR drop worsens.
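The statement in Section 2 that the delay of an uninterrupted wire grows quadratically with its length, and that repeaters pay off beyond a certain length, can be illustrated with a small sketch. It is not taken from the paper: it assumes a first-order distributed-RC (Elmore) wire model, treats every segment as driven by one repeater with a fixed delay (the first repeater being the driver), and uses placeholder values for resistance, capacitance and repeater delay. All function and parameter names are hypothetical.

```python
import math


def unrepeated_delay(length_mm, r_per_mm, c_per_mm):
    """Elmore delay of an uninterrupted distributed-RC wire: 0.5 * r * c * L^2."""
    return 0.5 * r_per_mm * c_per_mm * length_mm ** 2


def repeated_delay(length_mm, r_per_mm, c_per_mm, segments, t_repeater):
    """Delay of the wire split into equal segments, each driven by one repeater."""
    segment_mm = length_mm / segments
    per_segment = unrepeated_delay(segment_mm, r_per_mm, c_per_mm) + t_repeater
    return segments * per_segment


def best_segment_count(length_mm, r_per_mm, c_per_mm, t_repeater):
    """Integer segment count that minimizes repeated_delay in this simplified model."""
    # Continuous optimum of 0.5*r*c*L^2/k + k*t is k = L * sqrt(0.5*r*c / t).
    ideal = length_mm * math.sqrt(0.5 * r_per_mm * c_per_mm / t_repeater)
    candidates = {max(1, math.floor(ideal)), max(1, math.ceil(ideal))}
    return min(
        candidates,
        key=lambda k: repeated_delay(length_mm, r_per_mm, c_per_mm, k, t_repeater),
    )


if __name__ == "__main__":
    # Placeholder (assumed) values, not figures from the paper.
    R_PER_MM = 100.0      # ohm per mm
    C_PER_MM = 0.2e-12    # farad per mm
    T_REPEATER = 50e-12   # seconds per repeater stage
    for length in (1.0, 5.0, 20.0):
        k = best_segment_count(length, R_PER_MM, C_PER_MM, T_REPEATER)
        plain = repeated_delay(length, R_PER_MM, C_PER_MM, 1, T_REPEATER)
        split = repeated_delay(length, R_PER_MM, C_PER_MM, k, T_REPEATER)
        print(f"{length:5.1f} mm: segments={k}, single={plain:.3e} s, repeated={split:.3e} s")
```

In this model the segmented wire's delay grows roughly linearly with length, while the single-segment delay grows quadratically. The extra power drawn by the repeaters, which the text names as the cost of the technique, is not modeled here.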

Related resources

Pharmacological and non-pharmacological treatments for nightmare disorder.

Interest in the treatment of nightmares has greatly increased over the last several years as research has demonstrated the clinical significance of nightmare disorder. This paper provides an overview of nightmare disorder, its clinical relevance, and the leading treatments that are available. In particular, the paper defines nightmare disorder and then summarizes the recent literature examining ...

O-17: Female Genital Mutilation: A Curse or Blessing among Women of Reproductive Age in Nigeria

Background: Female genital mutilation (FGM) practice is mostly carried out by traditional circumcisers, who often play other central roles in communities, such as attending childbirths. Increasingly, FGM is also performed by health care providers. However, FGM is recognized internationally as a violation of the human rights of girls and women. The study investigates a broad cross-cultural study...

Lucid dreaming treatment for nightmares: a pilot study.

BACKGROUND The goal of this pilot study was to evaluate the effects of the cognitive-restructuring technique 'lucid dreaming treatment' (LDT) on chronic nightmares. Becoming lucid (realizing that one is dreaming) during a nightmare allows one to alter the nightmare storyline during the nightmare itself. METHODS After having filled out a sleep and a posttraumatic stress disorder questionnaire,...

Physiological-emotional reactivity to nightmare-related imagery in trauma-exposed persons with chronic nightmares.

Script-driven imagery was used to assess nightmare imagery-evoked physiological-emotional reactivity (heart rate, skin conductance, facial electromyogram, subjective ratings) in trauma-exposed persons suffering from chronic nightmares. Goals were to determine the efficacy of nightmare imagery to evoke physiological-emotional reactivity, correlates (mental health, nightmare characteristics) of r...

Cost-aware Topology Customization of Mesh-based Networks-on-Chip

Nowadays, the growing demand for supporting multiple applications leads to the use of multiple IPs on a chip. As a result, finding a truly scalable communication architecture will be a critical concern. To this end, the Networks-on-Chip (NoC) paradigm has emerged as a promising solution to on-chip communication challenges within silicon-based electronics. Many of today’s NoC architectures are based...

The relationship of nightmare frequency and nightmare distress to well-being.

Nightmares can be defined as very disturbing dreams, the events or emotions of which cause the dreamer to wake up. In contrast, unpleasant dreams can be defined in terms of a negative emotional rating of a dream, irrespective of whether or not the emotions or events of the dream woke the dreamer. This study addresses whether frequency of unpleasant dreams is a better index of low well-being tha...

Publication date: 2002